Average multi microbenchmarks results by VincentBu · Pull Request #5215 · dotnet/performance

VincentBu · 2026-05-01T07:37:34Z

This PR calculates the average value across the results of multiple microbenchmark runs. The work revolves around:

Reduce memory usage.
Change namespace of some classes and rename them for future work.

Copilot

Pull request overview

This PR updates the GC microbenchmark infrastructure to support aggregating (averaging) results across multiple microbenchmark runs/iterations, while also renaming/refactoring parts of the analysis/presentation pipeline and introducing an outlier-removal helper.

Changes:

Add configurable microbenchmark iteration count (iterations) and wire it into suite creation and execution.
Replace the previous single-result comparison flow with a new per-benchmark aggregation/comparison pipeline (MicrobenchmarkResultComparison, GCTraceMetrics, GCTraceMetricComparisonResult).
Refactor output generation to primarily emit JSON (markdown generation currently disabled).
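
The aggregation described above can be pictured as grouping per-iteration results by benchmark name and averaging each group. The following is a minimal sketch only; `IterationResult`, `IterationAveraging`, and `AverageByBenchmark` are hypothetical names for illustration and are not taken from the PR.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: the mean reported for one benchmark in one iteration.
public sealed record IterationResult(string BenchmarkName, double MeanNanoseconds);

public static class IterationAveraging
{
    // Groups the results by benchmark name and averages the per-iteration means.
    public static Dictionary<string, double> AverageByBenchmark(IEnumerable<IterationResult> results)
    {
        return results
            .GroupBy(r => r.BenchmarkName)
            .ToDictionary(g => g.Key, g => g.Average(r => r.MeanNanoseconds));
    }
}
```

With three iterations of the same benchmark, the dictionary holds a single entry whose value is the arithmetic mean of the three per-iteration means.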

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 18 comments.

| File | Description |
| --- | --- |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/CreateSuiteCommand.cs | Reads configured iteration count and applies it to microbenchmark suite environment. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/BaseSuite/MicrobenchmarksToRun.txt | Updates baseline suite benchmark list. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/BaseSuite/Microbenchmarks.yaml | Renames environment iteration setting to `iterations`. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/Microbenchmark/MicrobenchmarkCommand.cs | Runs microbenchmarks for `iterations` and switches to new aggregation/comparison logic before presenting results. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/Microbenchmark/MicrobenchmarkAnalyzeCommand.cs | Updates analysis-only command to use the new aggregation/comparison logic. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Presentation.cs | Changes presentation API to accept precomputed grouped results; markdown output path currently disabled. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Markdown.cs | Markdown generation code is commented out. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Json/JsonOutput.cs | Removes unused placeholder type. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Json.cs | Moves JSON generator to Microbenchmarks presentation namespace and updates signature for grouped results. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Configurations/Microbenchmarks.Configuration.cs | Renames `iteration` to `iterations` in microbenchmark environment configuration. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Configurations/InputConfiguration.cs | Adds `iterations` map to input configuration. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResultsAnalyzer.cs | Removes old analyzer/comparison pipeline. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResultComparison.cs | Adds new JSON/trace mapping, per-benchmark analysis, and aggregation/grouping logic. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResult.cs | Introduces new MicrobenchmarkResult model (namespace currently mismatched vs usage). |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkComparisonResult.cs | Updates comparison to support averaged values/outlier removal and new trace-metric comparisons. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetrics.cs | Adds trace-derived metric extraction (includes reflection/stat bugs). |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetricComparisonResult.cs | Adds averaged comparison for trace metrics (baseline vs comparand). |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetricComparison.cs | Adds helper wrapper for metric comparison construction. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/BdnJsonResult.cs | Refactors BDN JSON model types; renames top-level to `BdnJsonResult`. |
| src/benchmarks/gc/GC.Infrastructure/GC.Analysis.API/Statistics.cs | Adds `RemoveOutliers` helper (IQR method). |
| src/benchmarks/gc/GC.Infrastructure/Configurations/Run.yaml | Adds iteration configuration block (currently mismatched with new `iterations` input model). |
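
For readers unfamiliar with the IQR method mentioned for `Statistics.cs`, the sketch below shows one common form of it. It is not the PR's `RemoveOutliers` implementation: the class name `OutlierSketch`, the linear-interpolation quartile convention, the 1.5 multiplier, and the minimum sample size of four are all assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OutlierSketch
{
    // Percentile by linear interpolation between the two closest ranks.
    // Other quartile conventions exist; this one is an assumption.
    private static double Percentile(IReadOnlyList<double> sorted, double p)
    {
        double position = (sorted.Count - 1) * p;
        int lower = (int)Math.Floor(position);
        int upper = (int)Math.Ceiling(position);
        return sorted[lower] + (sorted[upper] - sorted[lower]) * (position - lower);
    }

    // Returns the values (in ascending order) that lie inside
    // [Q1 - k * IQR, Q3 + k * IQR]. Inputs with fewer than four
    // values are returned sorted but otherwise unchanged.
    public static List<double> RemoveOutliersIqr(IEnumerable<double> values, double k = 1.5)
    {
        List<double> sorted = values.OrderBy(v => v).ToList();
        if (sorted.Count < 4)
        {
            return sorted;
        }

        double q1 = Percentile(sorted, 0.25);
        double q3 = Percentile(sorted, 0.75);
        double iqr = q3 - q1;
        double low = q1 - k * iqr;
        double high = q3 + k * iqr;

        return sorted.Where(v => v >= low && v <= high).ToList();
    }
}
```

Filtering before averaging keeps a single noisy iteration from dominating the mean, at the cost of discarding data when the sample is small.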

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 13 comments.

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 11 comments.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot

Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.

Comments suppressed due to low confidence (1)

src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Json.cs:15

`Json.Generate` doesn’t use the `configuration` parameter, and the `using GC.Analysis.API;` / `using GC.Infrastructure.Core.Presentation.GCPerfSim;` directives are unused. Consider removing the unused parameter/usings to avoid warnings and keep the API surface minimal.

Copilot

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResultComparison.cs:155

When trace collection is enabled, `tracePath` falls back to an empty string if the JSON path isn’t present in `jsonToTraceMap`, and that empty path is passed to `AnalyzerManager.GetAnalyzer(...)`, which will fail with a less actionable error. Prefer validating the mapping result (e.g., `TryGetValue`) and throwing an exception that includes the missing JSON path / benchmark name, or ensuring `MapJsonToTrace` guarantees coverage for every `jsonPath`.

                if ((!excludeTraces) && configuration.TraceConfigurations.Type != "none")
                {
                    string outputPathForRun = Path.Combine(configuration.Output.Path, run.Name!);
                    string tracePath = jsonToTraceMap.GetValueOrDefault(jsonPath, "");

                    using (var analyzer = AnalyzerManager.GetAnalyzer(tracePath))
                    {
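
One way to apply the `TryGetValue` suggestion from this comment is sketched below. The helper `TraceLookup.ResolveTracePath` and its `benchmarkName` parameter are hypothetical; only `jsonToTraceMap` and `jsonPath` come from the quoted excerpt, and the exception type is an assumption.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class TraceLookup
{
    // Hypothetical helper: returns the trace path mapped to a JSON result,
    // or throws with enough context to identify the missing mapping.
    public static string ResolveTracePath(
        IReadOnlyDictionary<string, string> jsonToTraceMap,
        string jsonPath,
        string benchmarkName)
    {
        if (!jsonToTraceMap.TryGetValue(jsonPath, out string? tracePath)
            || string.IsNullOrEmpty(tracePath))
        {
            throw new InvalidOperationException(
                $"No trace file is mapped to '{jsonPath}' (benchmark '{benchmarkName}').");
        }

        return tracePath;
    }
}
```

The caller would then pass the returned path to the analyzer, so a missing mapping surfaces as an error that names the JSON file rather than as a failure on an empty path.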

src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Markdown.cs:181

`GetValueOrDefault(column)` on `Dictionary<string,double>` returns 0 when the metric is missing, so `baselineValue.HasValue` is always true and missing metrics render as 0 (and `deltaPercent` can divide by 0). This can silently produce incorrect tables when a column can’t be computed (e.g., GC-derived columns when traces are unavailable). Use `TryGetValue` and treat missing metrics as null/empty, and guard the percent calculation when the baseline value is 0.

                        foreach (var column in configuration.Output.Columns)
                        {
                            double? baselineValue = lr.AveragedBaselineOtherMetrics.GetValueOrDefault(column);
                            double? comparandValue = lr.AveragedComparandOtherMetrics.GetValueOrDefault(column);

                            string baselineResult = baselineValue.HasValue ? Math.Round(baselineValue.Value, 4).ToString() : string.Empty;
                            string comparandResult = comparandValue.HasValue ? Math.Round(comparandValue.Value, 4).ToString() : string.Empty;
                            double? delta = baselineValue.HasValue && comparandValue.HasValue ? comparandValue.Value - baselineValue.Value : null;
                            string deltaResult = delta.HasValue ? Math.Round(delta.Value, 4).ToString() : string.Empty;

                            double? deltaPercent = delta.HasValue ? (delta / baselineValue.Value) * 100 : null;
                            string deltaPercentResult = deltaPercent.HasValue ? Math.Round(deltaPercent.Value, 4).ToString() : string.Empty;
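
The fix suggested in this comment could look roughly like the sketch below. `MetricCells`, `Lookup`, `DeltaPercent`, and `Format` are hypothetical helper names, and the use of the invariant culture for formatting is an assumption rather than something the PR specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class MetricCells
{
    // Returns null (not 0) when the metric is absent from the dictionary.
    public static double? Lookup(IReadOnlyDictionary<string, double> metrics, string column)
    {
        return metrics.TryGetValue(column, out double value) ? value : (double?)null;
    }

    // Percent change relative to the baseline; null when either side is
    // missing or when the baseline is 0 (the ratio would be undefined).
    public static double? DeltaPercent(double? baseline, double? comparand)
    {
        if (!baseline.HasValue || !comparand.HasValue || baseline.Value == 0)
        {
            return null;
        }

        return (comparand.Value - baseline.Value) / baseline.Value * 100;
    }

    // Renders a missing value as an empty table cell.
    public static string Format(double? value)
    {
        return value.HasValue
            ? Math.Round(value.Value, 4).ToString(CultureInfo.InvariantCulture)
            : string.Empty;
    }
}
```

With these helpers a column that cannot be computed yields empty cells instead of a misleading 0, and a zero baseline yields an empty percent cell instead of infinity or NaN.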

Copilot

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/BdnJsonResult.cs:25

The BDN JSON POCOs declare many non-nullable reference-type properties (e.g., `HostEnvironmentInfo.BenchmarkDotNetCaption`, `Benchmark.Measurements`, `BdnJsonResult.Benchmarks`) without initialization. With `<Nullable>enable</Nullable>`, this produces CS8618 warnings and also makes it easy for deserialization to yield nulls at runtime if fields are absent. Consider making these properties nullable where appropriate or initializing with `= null!;` / empty collections for required fields.
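
The sketch below illustrates the two options this comment names: optional values become nullable, and collections default to empty. The `...Sketch` types are simplified stand-ins, not the PR's classes; the member types and the use of `System.Text.Json` are assumptions, since the source does not show which serializer the infrastructure uses.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

public sealed class HostEnvironmentInfoSketch
{
    // Optional in the payload, so declared nullable.
    public string? BenchmarkDotNetCaption { get; set; }
}

public sealed class MeasurementSketch
{
    public double Nanoseconds { get; set; }
}

public sealed class BenchmarkSketch
{
    // Defaults to an empty list, so callers never observe null here
    // when the field is absent from the JSON.
    public List<MeasurementSketch> Measurements { get; set; } = new();
}

public sealed class BdnJsonResultSketch
{
    public HostEnvironmentInfoSketch? HostEnvironmentInfo { get; set; }
    public List<BenchmarkSketch> Benchmarks { get; set; } = new();

    // Deserializes a payload; a JSON literal null yields an empty result object.
    public static BdnJsonResultSketch Parse(string json)
    {
        return JsonSerializer.Deserialize<BdnJsonResultSketch>(json) ?? new BdnJsonResultSketch();
    }
}
```

Neither declaration style triggers CS8618, and a payload that omits `Benchmarks` produces an empty list instead of a null reference. An explicit JSON `null` for a collection can still overwrite the default, so consumers that accept untrusted input may want an additional check.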

Copilot

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.

VincentBu and others added 5 commits April 23, 2026 14:18

add iterations section for end-2-end config and set for microbenchmarks when creating suites

9db4f30

Merge branch 'dotnet:main' into average-microbenchmarks-iterations

a730074

add json-trace map and implement AnalyzeForBenchmark

9802484

Calculate comparison result by benchmark name, rename classes and adjust namespaces

f2318ac

present list of microbenchmarkresults

e34745b

Copilot AI review requested due to automatic review settings May 1, 2026 07:37

Copilot started reviewing on behalf of VincentBu May 1, 2026 07:38

Copilot AI reviewed May 1, 2026

VincentBu commented May 1, 2026

Comment thread ...c/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/BaseSuite/MicrobenchmarksToRun.txt

VincentBu commented May 1, 2026

Comment thread src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetrics.cs

VincentBu commented May 1, 2026

Comment thread ...hmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetricComparisonResult.cs

VincentBu commented May 1, 2026

Comment thread src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/BdnJsonResult.cs

rename iteration section to iterations for Run.yaml

e9594b3

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 6, 2026 05:25

Copilot started reviewing on behalf of VincentBu May 6, 2026 05:25

Copilot AI reviewed May 6, 2026

Potential fix for pull request finding

c3f4b45

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 6, 2026 06:26

Copilot started reviewing on behalf of VincentBu May 6, 2026 06:27

Copilot AI reviewed May 6, 2026

VincentBu added 8 commits May 6, 2026 14:41

fix for microbencharmks comparison

240ceb8

fix bugs intrduced in previous commit

1b08bd0

Add json-only comparison

434a17c

extract a shared helper for analyze command

faca628

take trace type into consideration

ab17bf3

move MicrobenchmarkResult to GC.Infrastructure.Core.Analysis.Microbenchmarks namespace

70c7f16

validate if run is null

d9b160d

rename PauseDurationSeconds_SumWhereIsGen1 to PauseDurationMSec_SumWhereIsGen1

2ba9f6b

VincentBu marked this pull request as draft May 7, 2026 09:24

Copilot AI review requested due to automatic review settings May 7, 2026 09:25

VincentBu added 6 commits May 15, 2026 14:09

includes int type properties

6ba2219

check key existence and set parallelism degree to 2 * cpu_count

cc59167

check if metricName is a key of StatsData

a7c1b2c

Filter out null GCTraceMetrics instances before calling CompareGCTraceMetric

357ce0a

update initialization of OtherMetrics for MicrobenchmarkComparisonResult

3d1c330

comment out cpu_columns related code

eb9f7e3

Copilot AI review requested due to automatic review settings May 15, 2026 08:01

Copilot started reviewing on behalf of VincentBu May 15, 2026 08:01

Copilot AI reviewed May 15, 2026

VincentBu requested a review from Copilot May 15, 2026 09:15

Copilot started reviewing on behalf of VincentBu May 15, 2026 09:15

Copilot AI reviewed May 15, 2026

VincentBu added 7 commits May 15, 2026 17:27

remove unused imports

7d5709a

"Microbechmark" (missing 'n')

994ea7a

avoid breaking formatting

ac9ec48

sort improvements in ascending order

b7bfe3d

Convert to ToDictionary(x => x.Item1, x => x.Item2) (or tuple names) for each projection

08ec2a5

provide key value pair projection

722ace7

sort in order by index

cf32db5

VincentBu requested a review from Copilot May 18, 2026 06:52

Copilot started reviewing on behalf of VincentBu May 18, 2026 06:52

Copilot AI reviewed May 18, 2026

remove Spectre markup tokens from output path

766fc82

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 18, 2026 07:05

Copilot started reviewing on behalf of VincentBu May 18, 2026 07:06

Copilot AI reviewed May 18, 2026

Check if first baseline/comparand is null

2ba985f

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings May 18, 2026 07:32

Copilot started reviewing on behalf of VincentBu May 18, 2026 07:32

Copilot AI reviewed May 18, 2026
